European Clinical Case Corpus
Abstract
Interpreting information in medical documents has become one of the most relevant application areas for language technologies. However, despite the fact that huge amounts of medical documents (e.g., medical examination reports, hospital discharge letters, digital medical records) are produced, their availability for research purposes is still limited, due to strict data protection regulations. Aiming at fostering advanced information extraction technologies for medical applications, we present E3C, a corpus of clinical case narratives fully based on freely licensed documents. E3C (European Clinical Case Corpus) contains a vast selection of clinical cases (i.e., narratives presenting a patient's history) that cover different medical areas and styles and are produced in different European languages. A portion of the corpus has been manually annotated to be used for training and testing purposes, while a larger set of documents has been automatically tagged to serve as a baseline for future research on information extraction.
Similar resources
The Sentence-Aligned European Patent Corpus
This paper describes the creation and the content of the Sentence-Aligned European Patent Corpus. The corpus contains more than 130 million sentence pairs for 6 European languages. With more than 76 million sentence pairs, to our knowledge, the EN-DE sub corpus is the largest bilingual sentence-aligned corpus. For other language pairs, work has started to obtain sub corpora of similar size. The...
DCEP - Digital Corpus of the European Parliament
We are presenting a new highly multilingual document-aligned parallel corpus called DCEP, the Digital Corpus of the European Parliament. It consists of various document types covering a wide range of subject domains. With a total of 1.37 billion words in 23 languages (253 language pairs), gathered in the course of ten years, this is the largest single release of documents by a European Union institu...
Corpus Stylistics and Translation: Toni Morrison's Beloved as a Case Study
Stylistics, as a method for understanding and interpreting a literary text, analyzes texts through a variety of approaches, which has given rise to different branches of stylistics, including formalist stylistics, cognitive stylistics, corpus-based stylistics, and others. As a first step, this study seeks to demonstrate the importance and usefulness of corpus-based stylistics in analyzing the stylistic features of a literary work, and for this purpose...
Priberam Compressive Summarization Corpus: A New Multi-Document Summarization Corpus for European Portuguese
In this paper, we introduce the Priberam Compressive Summarization Corpus, a new multi-document summarization corpus for European Portuguese. The corpus follows the format of the summarization corpora for English in recent DUC and TAC conferences. It contains 80 manually chosen topics referring to events that occurred between 2010 and 2013. Each topic contains 10 news stories from major Portuguese n...
Tagging a Corpus of Interpreted Speeches: the European Parliament Interpreting Corpus (EPIC)
The performance of three different taggers (Treetagger, Freeling and GRAMPAL) is evaluated on three different languages, i.e. English, Italian and Spanish. The materials are transcripts from the European Parliament Interpreting Corpus (EPIC), a corpus of original (source) and simultaneously interpreted (target) speeches. Owing to the oral nature of our materials and to the specific characterist...
Journal
Journal title: Cognitive Technologies
Year: 2022
ISSN: 2197-6635, 1611-2482
DOI: https://doi.org/10.1007/978-3-031-17258-8_17